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Abstract 

We present a high-order accurate boundary-based solver for three-dimensional (3D) frequency-domain scattering from 
a doubly-periodic grating of smooth axisymmetric sound-hard or transmission obstacles. We build the one-obstacle 
solution operator using separation into P azimuthal modes via the FFT, the method of fundamental solutions (with 
N proxy points lying on a curve), and dense direct least-squares solves; the effort is 0(N^P) with a small constant. 
Periodizing then combines fast multipole summation of nearest neighbors with an auxiliary global Helmholtz basis 
expansion to represent the distant contributions, and enforcing quasiperiodicity and radiation conditions on the unit 
cell walls. Eliminating the auxiliary coefficients, and preconditioning with the one-obstacle solution operator, leaves 
a well-conditioned square linear system that is solved iteratively. The solution time per incident wave is then 0(NP) 
at fixed frequency. Our scheme avoids singular quadratures, periodic Green’s functions, and lattice sums, and its 
convergence rate is unaffected by resonances within obstacles. We include numerical examples such as scattering 
from a grating of period 13 A x 13 A comprising highly-resonant sound-hard “cups” each needing NP = 64800 surface 
unknowns, to 10-digit accuracy, in half an hour on a desktop. 
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1. Introduction 

The control of waves using periodic structures is crucial for modern optical, electromagnetic and acoustic devices 
such as diffraction gratings, filters, photonic crystals and meta-materials (451, solar cells (H, and absorbers (641 [27l . 
Periodic scattering problems also arise in monitoring (63l or imaging El a patterned structure. Outside of asymptotic 
regimes where analytic models are useful, efficient and accurate numerical simulation is key to assess sensitivity to 
changes in parameters, and to optimize those parameters to improve device performance. 

Here we present a solver for 3D acoustic scattering from a doubly-infinite array of isolated axisymmetric objects, 
as shown in Fig.[2a). With acoustic applications in mind, we focus on the Neumann (sound-hard) boundary condition 
(including resonators) and transmission problems, and on highly accurate solutions. Sound absorbing surfaces often 
involve periodic structures such as perforated slabs |[64j Sec. 6.5.4], resonators |[64j Sec. 9.2.3] (29l Ch. 12], or wedges 
(as in anechoic chamber walls) (29j Fig. 12-13]. Recently, there has also been interest in acoustic meta-materials (25\ 
or “phononics”, including new phenomena such as anomalous transmission (25l Ch. 4] and acoustic cloaking ca. 
The new techniques we present are relatively simple to extend to multilayer geometries (such as perforated slabs) and 
poroelastic media (such as foams) common in noise control (65l . We view this work—in particular the periodizing 
scheme, which is very general—as a step toward 3D multilayer periodic boundary-based solvers (generalizing recent 
work in 2D El) for acoustics, coupled acoustics-elastodynamics, the Maxwell equations, and Stokes flow. 

Let us explain where our contribution fits into the bigger picture. Direct volume discretization methods including 
the finite element |[6||63l and finite difference time-domain imiiQi are common for acoustic scattering problems, and 
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can be successful for low-to-medium frequencies and accuracies. However, low-order finite elements suffer from the 
accumulation of phase errors across the domain (known as “pollution” 0 ), meaning that an increasing number of 
unknowns per wavelength are needed as the wavenumber k grows, so that greater than 0(P) unknowns are needed 
to maintain accuracy. High-order finite elements, while showing recent promise for medium-frequency acoustics at 
accuracies of a couple of digits ifTTl . are tricky to generate in complex geometries, and have not been used for the 
periodic problem as far as we are aware. Time-domain methods (e.g. FDTD) also suffer from a low convergence 
order (usually at most Ist-order in the presence of surfaces; recent work has recovered 2nd-order 1401 ). difficulty in 
modeling impedance boundary conditions (321, and very long settling times if a structure is resonant. It should be 
noted that via the Fourier transform, FDTD can solve many frequencies in a single shot. However, accuracies from 
such methods are commonly 1-2 digits, even with dozens of grid points per wavelength (401 . 

For piecewise-uniform media, boundary-based methods become more efficient than direct discretization once the 
geometry is more than a couple of wavelengths across, and/or if an accuracy beyond a couple of digits is needed. Only 
N = 0(k^) unknowns are needed for a smooth obstacle if a high-order surface quadrature is used. The most common 
approach is the boundary integral method (BIE), in which the scattered wave is represented via potential theory using 
the Helmholtz Green’s function 

Gk(x,y):=— -x,yeMV (1) 

47r|x - y| 

For the mathematical foundation of this method in the non-periodic setting see ll22ll^ . and in our periodic setting 
m. Formulation as a 2nd kind Fredholm integral equation on dkl, the boundary of an obstacle Q c has the 
advantage that the condition number of the discretized linear system remains small independent of A/" as ^ is held 
fixed. Using the fast multipole method (FMM) (3711201 to apply the dense matrix discretization of the operator inside 
an iterative Krylov method solver such as GMRES ( 68 l can create an 0(N) solver (at moderate frequencies). However, 
in practice two problems plague this otherwise attractive scheme: 1 ) for resonant or geometrically complex objects 
the large number of eigenvalues close to the origin causes the number of iterations to be unreasonably large, and direct 
solvers can be orders of magnitude faster (33l . 2) high-order surface quadratures in 3D are quite challenging and are 
still an area of active research |[l7l|23|66l[I3[ini. We note that low-order Galerkin methods are the most commonly 
used in Helmholtz problems (67l . The method we propose fixes both of these problems in the axisymmetric setting. 

We use a close relative of the BIE, the method of fundamental solutions (MFS, also known as auxiliary sources (TOl 
or the charge simulation method (52l ). in which (for sound-hard scattering) the scattered wave has a representation 

N 

( 2 ) 

i=i 

where e 7 = 1,..., TV, are A/ source points covering a source surface F_ lying inside the obstacle Q but close to 
its boundary dkl. The coefficients [cj] are the solution to a linear system set up by matching to the boundary data at 
collocation points on dkl. The MFS idea (see the review (3T]| ) is old, being first proposed by Kupradze-Aleksidze (551, 
and is common in the engineering community (28l . The MFS may be viewed as an exponentially ill-conditioned first- 
kind BIE. Yet, when combined with a backward stable linear solver it can achieve close to machine precision when the 
source points are chosen correctly Q. An optimal choice of source points remains one of the more ad-hoc aspects of 
the MFS, although we demonstrate in Sec. |3.2| an excellent scheme for analytic boundaries that requires only a single 
adjustable parameter. The MFS has a significant advantage over BIE: because F_ is separated from 5Q, no singular 
surface quadratures are needed (either for the self-interaction of the object or for evaluation of u close to dO). This 
will enable us to handle Neumann and transmission conditions simply (in contrast, the robust BIE formulation in the 
Neumann case requires handling the derivative of the double-layer operator either as a hypersingular operator (53ll or 
using Calderon regularization EH Sec. 3.6] ID, and in the transmission case the compact difference of such operators 
( 22 I Sec. 3.8]). The MFS has proven useful in 2D acoustic settings (65]| . 

In the axisymmetric case the MFS becomes more efficient EollTOlinilTll, because the problem separates into P 
angular Fourier modes that may be solved independently, each with a small number of unknowns TV, the total number 
of unknowns being TV = NP. As we will show, many smooth obstacles up to lOT in diameter need only TV < 10^ for 
10-digit accuracy, with P of order one hundred. Since dense least-squares solves of such a size TV are cheap, the good 
conditioning of a BIE approach confers little advantage over using MFS, at least for smooth domains. (Recent work 
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also shows that several digits of accuracy is possible with the MFS in corner domains ||44|[60l.) We note that recently 
some technical challenges of high-order BIE on axisymmetric surfaces have been solved 11741 ITOl42ll43]| . but not in 
the case of transmission boundary conditions that we address. 

We now outline how we turn a scheme for solving the scattering from one obstacle into a scheme for a bi-infinite 
array of obstacles—we refer to this as “periodizing”; it is one of the main contributions of this paper. The standard 
way to periodize in 2D |[62l[T^ or 3D Ibbl l^fTSll is to replace the free-space Green’s function ([T]) by its quasiperiodic 
version which sums over all source points, 

Gf{x, y) := ^ a’^(3’'Gk(x, y + mCjc + nCy), (3) 

n,meZ 

where the Bloch phases are a and {3 (defined below in (0), and the array lattice vectors are and as in Fig.j^a). 
The above sum is notoriously slowly convergent, hence a host of schemes such as Ewald’s method (301 [461 mO, other 
spatial-spectral splittings (471, or lattice sums (67l|58l have been developed for numerical evaluation. These schemes 
are generally quite complicated, both analytically and in terms of implementation, and raise two major problems: 

1. While they are able to fill the elements of a dense matrix, most such schemes are incompatible with the 
EMM or other fast algorithms. Exceptions are the lattice-sum based correction to the FMM of Otani et al 1^ 
and the rolled-off spatial sum of Bruno et al (H. 

2. At certain sets of parameters {a,(3) and k called Wood anomalies, the quasiperiodic Green’s function does not 
exist, i.e. the sum diverges, even though the solution to the diffraction problem remains well-posed and 
finite. 

We propose a simple new approach, following OITllETlIini. which cures the first problem. In the setting of contin¬ 
uous interfaces such as ED, our approach would also cure the second; however, since we focus on isolated obstacles, 
then (as with the work in (67l[T8l) we do not attempt to address Wood anomalies here. Our representation restricts 
the sum ^ to only the 3 x 3 block of nearest neighbors, adding an auxiliary spherical harmonic basis (of maximum 
degree p) for smooth Helmholtz solutions in the neighborhood of the object, 

N p I 

uix)^Y^CjG"“(x,yj)+'^'^dimji(kr)Yim(0,^), where G“”(x,y) := ^ a'"J3"Gk{x,y + me^ + nCy). (4) 

7=1 1=0 m=-l |m|,|n|<l 

Here (r, 6, (p) are spherical coordinates. The auxiliary basis represents the contribution from the remainder of the lattice 
of images; since they are far from Q, this has rapid exponential convergence with p. An expanded linear system is 
used to solve for the coefficients. Being rectangular and ill-conditioned (as with the plain MFS method), this cannot 
be solved iteratively in the case of large N. In Sec. |4.2| we show that, because of axial symmetry, preconditioning is 
possible using a direct factorization of the MFS matrix pseudoinverse for the single obstacle. In the iterative scheme, 
the contributions from the near images are applied using the FMM, to give a scheme that solves each new incident 
wave in 0(N). Our work can thus be seen as a periodic generalization of the fast multibody scattering work of 
Gumerov-Duraiswami for spheres (381 , of Gimbutas-Greengard for smooth scatterers (36l . and of Hao-Martinsson- 
Young El for axisymmetric scatterers. 

One novel aspect is the high accuracy we achieve (around 10 digits) compared to most other periodic integral 
equation work EzlIIll. This is true even for resonant obstacles, thanks to the direct solve used for the isolated 
obstacle. Our scheme is very practical for periods up to a dozen wavelengths in each direction, but cannot go much 
higher than this in reasonable CPU time due to the O(p^) scaling of the dense matrix operations. However, this covers 
the vast majority of diffraction applications, where typically the period is of order the wavelength, and also allows 
“super-cell” simulations, for instance for random media. 

A similar scheme has recently been proposed by Gumerov-Duraiswami (39l for periodizing the 3D Laplace 
equation, using proxy sources instead of the auxiliary basis in 0; however, in Appendix A we show that for our 
application to the 3D Helmholtz equation spherical harmonics are much more efficient. 

Our paper is organized as follows. In Sec. we state the two periodic scattering boundary value problems under 
study, namely Neumann and transmission conditions. In Sec.we explain the MFS in the axisymmetric one-obstacle 
setting, including the choice of source points, and study its convergence when using a dense direct solve for each 
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Figure 1: (a) Periodic scattering geometry in 3D. (b) Sketch of the method of fundamental solutions (MFS) for solution of an exterior BVR (For 
simplicity, the 2D case is shown.) (c) MFS for a transmission problem (again, the 2D case is shown). 


Fourier mode. In Sec. we present the periodizing scheme and the resulting full linear system, and then show how 
Schur complements can turn this into a well-conditioned square system that can be solved iteratively. Numerical 
results are presented in Sec.[^ then we conclude in Sec.[^ The appendix compares the efficiency of our periodizing 
scheme with a variant using proxy points. 

2. Formulation of the boundary value problems 

Let c be an axisymmetric obstacle with boundary dQ defined by the rotation about the z-axis of a smooth 
curve y lying in the p-z plane, where (p, 6, z) define a cylindrical coordinate system. For simplicity we consider a 
rectangular lattice defined by vectors ej^ = (ex, 0,0) and e^ = (0, ey, 0), so that the grating of objects (see Fig.^a)) is 
defined by Qa := {(v,y,z) e : (x mex,y + ncy, z) e O, m,n e Z}, and is assumed not to self-intersect. Note that 
our scheme will also apply to general (possibly skew) lattices and a general (fixed) axis of symmetry of the objects 
with minor changes in bookkeepping. An incident plane wave (representing pressure variation of a time-harmonic 
acoustic wave), 

uXx) = e^^''^, X := (x,y,z) G (5) 

impinges on this lattice, with given wavevector k = (kx, ky, k^) whose free-space wavenumber is k := |k|. This incident 
wave is quasiperiodic in the following sense. 

Definition 1. A function u ^ is said to be quasiperiodic with Bloch phases a and p, if 

a~^u(x + ex,y,z) = P~^u(x,y + ey,z) = u(x,y,z) V(x,y,z) e . (6) 

The incident wave parameters fix the Bloch phases 

with \a\ = \p\ = \, and u' is quasiperiodic with these phases. The resulting scattered wave u will share this quasiperiodic 
symmetry. As usual in scattering theory ||23]| , the physical wave outside the lattice of objects is the total U = U + u. 
The scattered wave obeys the exterior Helmholtz equation 

Am- f = 0, inM^\QA (8) 

and the upwards and downwards Rayleigh-Bloch radiation conditions (131 [HI 

u{x,y, z) = Z - zo)] , z>zo, (x,y) e (9a) 

m,neZ 

u(x, y,z)= Z ^^p + x^y + kZ’"\-z - Zo)], z < -Zo, (x, y) g R^ (9b) 

m,neZ 
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where zo is such that Q lies between the planes z = +zo, and where := + Inmlex^Ky := ky Innley and 

^m,n) _ (a:^)2 _ . define the plane wave wavevectors (/c^, Ky, Note that the sign of the square- 

root is taken as positive real or positive imaginary. These conditions state that u can be written as a uniformly 
convergent expansion of quasiperiodic plane waves of outgoing or decaying type away from the lattice. In applications 
the far field amplitudes amn and hmn are the desired quantities, giving the radiated strengths in the various Bragg orders. 


We will solve two cases of obstacle scattering: 1) Neumann (sound hard) boundary conditions scattering from an 
impenetrable obstacle, 


du du' 

, on 

dn dn 

(10) 

and 2) Transmission conditions, with a new wavenumber k- and quasiperiodic scattered wave u~ 
incident wave u' being defined as zero inside Q), i.e.. 

inside the object (the 

Au~ k-^u~ = 0, in Q 

(11a) 

u - u~ = -u' on dkl 

(11b) 

du du~ du' 

- -— = ondn. 

dn dn dn 

(11c) 


For simplicity, ( |llc| ) models the case where the interior and exterior densities are equal (a density difference would 
result in prefactors here (231 Sec. 2.1]). Note that, due to quasiperiodicity, the above PDF and boundary conditions 
need only be defined on a single copy of the object in the lattice. 

Given an incident wave u' with wavevector k, the full boundary value problem (BVP) is defined by ([^-([^, radia¬ 
tion conditions (|9a|)-([%|), and either Neumann condition ( p^ or transmission conditions ( |lla| )-( pT^ . The following 
rigorous results are known about the BVPs. With a general smooth obstacle shape Q and fixed wavespeed ratio (ma¬ 
terial property) k-jk, the transmission BVP has a solution for all incident wave directions and frequencies k > 0 (691 
Thm. 9]. The solution is unique for all but possibly a discrete set of frequencies, for each incident wave direction (6^ 
Thm. 8]. Such frequencies are referred to as hound states of the grating (691 , and correspond to physical resonances 
of the BVP. Similar statements hold in the Neumann case. In the transmission case, \fk- <k and all lines parallel to 
the z-axis intersect dQ. at only two points, then no such bound states can exist (691 Thm. 13]. A similar analysis for a 
connected interface (or multiple such interfaces) is carried out by Arens O. 

If = 0 for any pair of integers (m,n), this defines a Wood anomaly (59l|63, where one (or more) of the 
Rayleigh-Bloch waves is constant (non-decaying) in the z direction. Although does not exist at Wood anomalies 
(it diverges like an inverse square-root with respect to the incident wave parameters), the BVP remains well-behaved, 
i.e. well conditioned with respect to varying the amplitude of u\ 

Remark 1. Because a Bragg scattering mode is ‘'on the cusp of existence” at a Wood anomaly, the rate of change of 
scattering coefficients Omn cind bmn ^ith respect to incident angle or frequency k diverges there (as an inverse square 
root singularity). Thus, for example, at a parameter distance from a Wood anomaly, we expect to lose around 5 
digits of accuracy purely due to round-off in the representation of the input parameters. 

In this work, we assume that we are not at a Wood anomaly. However, we will find that the method we present in 
Sec.j^loses of order the same number of digits as discussed in the above remark. 

3. Method of fundamental solutions in the axisymmetric setting 

In this section we present the MFS for scattering from an isolated axisymmetric obstacle, which will form a key 
part of the periodic solver. We present the Neumann case first, and then explain how the transmission case differs. 
Thus we care to solve the BVP given by ^ in M^\Q, boundary conditions and the usual 3D Sommerfeld 
radiation condition, 

du 1 

-- iku = o(r ^), r \= |x| ^ oo (12) 

dr 

where the convergence implied by the little is uniform in angle. We exploit standard separation of variables to 
represent the incident U and scattered wave solution as a sum of azimuthal Fourier modes. Each mode results in an 
independent linear system. 
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Figure 2: Obstacle bodies of revolution tested in this work. Above is the 3D surface, and below the corresponding 2D generating curve and N 
MFS points (with M ^ 1.2iV boundary points). Color above shows the real part of the incident wave u" at the highest frequency tested k = 40 in 
Sec. |3.3| restricted to the surface. From left to right: (a) “Smooth” shape given in polar coordinates in the p-z plane by r(6) = 1+0.3 cos 46. The 
N = 50 MFS sources are shown with distance parameter is r = 0.1. (b) “Wiggly” shape r(6) = 1 + 0.3 cos 80. r = ±0.03. N = 100. (c) “Cup” 
shape given parametrically in 0 < f < tt by r(t) = I - a erf [(f - 7r/2)/fl], 6(t) = b-a + 2(l - ^)sa(t - 7r/2) where the half-thickness is a = 0.2 and 

opening half-angle b = 7r/6, and the a-rounded abs-val function is defined by Sa(x) := (c/ + erf(x/fl). r = 0.05. N = 100. 


(a) 


(b) 




Figure 3: (a) Illustration of MFS for the exterior axisymmetric BVP showing a “ring charge” (dotted circle, where dots show approximation by 
individual point sources), and target ring (solid circle), (b) Geometry for periodization scheme: Surface points (green), MFS source points (red), 
and box wall discretization points (blue). 
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Let the generating curve y for the body of revolution about the z-axis be parametrized by the smooth functions 
(p(0,z(0) in the p-z plane, with t e [0, tt]. The “speed function” ^’(0 = is non-vanishing. Note 

that the surface must also be smooth at t = 0 and t = tt, ie z'(0) = z'Ctt) = 0. For example, the generating curves we 
test are shown in the lower part of Fig. note that (c) is not formally smooth at t = 0 and tt, but is smooth to within 
double-precision rounding error because of the exponentially small deviation of the erf (error function) from +1 at 
large arguments. 

The right-hand side surface data is / = -du' Idn, which we approximate by the F-term truncated Fourier 
series 

PI2 

f(p,z,(p)~ ^ fn(p,z)e‘"‘>’, {p,z)&y, (13) 

n=-PI 2+1 


with coefficients 


- 1 

/„(p, •^) = ^ J /(P’ , neZ, (14) 

whose accurate numerical evaluation we present shortly. P is even, and the asymmetry of the ±P/2 terms will have 
no significant effect. The MFS source points {(p'., z'j)}^^^ live in the p-z plane, inside the curve y\ their location choice 
is discussed in the next section. The rotation of the jth source point an angle ip about the z-axis is denoted by 


in cylindrical coordinates. Our representation for the scattered potential will be in terms of Fourier mode Helmholtz 
“ring sources”, defining for the ^th mode evaluated at target x, 

1 

0„j(x):=—J^ Gdx,yj(<p))e-''‘^d<p, (15) 

recalling that is the free-space fundamental solution 0. 

Our ansatz for the scattered potential is then 

N PI2 

u(x) ^ z z 

j=l n=-P/2+l 


where the complex unknowns Cnj are stacked into vectors c„ := {Cnj}^=i grouped by Fourier mode. We write rj := 
[c-p/ 2 +i ;...; Cp/ 2 ] for the A/^F-component column vector of all unknowns. 

Imposing the boundary condition ( p^ means, for all surface points x = (p,(p, z) in cylindrical coordinates. 


^/2 . ^27: 

z Z- 4 X 

i=-P/2+l j=\ -"O 


^^{{p,(p,z),yj{tp))e '"‘fd<f 


PI2 


n=-P/2+l 


(p, z) € 7, 0 <(p <271 


By changing variable pto p - (p, and using orthogonality of the modes it is easy to derive that, for each mode n 
separately, 

N 

'2^CnjK(p,Z-,p'j,Zj) ~ fnip,z) , (p, z) € y (17) 

j=l 


should hold, where the target normal-derivative of the ^th “ring kernel” from source point (p', z') to target (p, z) (both 
lying in the p-z plane) is 


K(p,z-,p',z')-.-- 


2?r Jo 


dGic 

dn^ 


((p,0,z),(p',<p,z'))e-‘''‘^d^, 


(18) 


where is the normal to y at (p, z). Finally, we enforce at a set of M collocation points {(p^, Zm)}^=i on y to give 
the set of P independent rectangular linear systems 


N 

^ Zmn PZ j)Cfij = fn(Pm^ Zm) r> 

j=l 


m = 1 ,...,M, n = -F/2-r l,...,F/2 . 


(19) 
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Figure 4: Convergence of the absolute error in evaluating the Helmholtz ring kernel (22}, for the various n shown in the caption with k = 10. 
The shape we used is the one shown in Fig.|^a) with N = 200, r = 0.1. The ^-node periodic trapezoid rule is used, (a) Small distance: target 
(p,z) = (0.5,0.49) and source (p',z') = (0.45,0.44). (b) Large distance: target (p,z) = (0.6,0.2) and source (p',z') = (0.45,0.44). In each case the 
red line shows exponential convergence with rate equal to the supremum of the a predicted by Thm.[^ 


Typically M must be slightly larger than N to ensure accurate enforcement of the boundary condition. As is standard 
with the MFS, these systems are highly ill-conditioned, but if the sources are well chosen, are numerically consistent 
and possess a small coefficient norm ||c„||. Assuming this norm is 0(1), if a backward-stable least-squares solve is 
used, machine precision accuracy can be reached despite the ill-conditioning Q. For each n independently the system 
is now solved in this way. 

The set of linear systems may be written as 





C_p+1 

2 


/-f+f 


a; 

2 


Cp 

2 


- h - 


summarized by A'rj = f , 


( 20 ) 


where A' has diagonal block structure, each of the P diagonal blocks being a dense ill-conditioned rectangular matrix 
of size M by N. Each of the P systems will be solved independently in the least-squares sense using standard dense 
direct methods based on the QR decomposition (we use MATLAB’s mldivide). The only remaining task is to hh 
their matrix elements and right-hand side vectors. 

Numerical evaluation of the RHS fniPm^Zm) for each curve point m is done by applying the ^-node periodic 
trapezoid rule quadrature |[54j Sec. 12.1] to ([H.i.e., 

1 ® 

fn(p,z) ~ - 'y f(p,2nllq, , where f := {f{p,2nllq,z)}y , neZ, (21) 


where is the ^-point discrete Fourier transform matrix. The error in this quadrature formula (“aliasing” error) can 
be bounded by the sum of the magnitudes of the exact Fourier coefficients /„ for which \n\ > ql2 |[72| (3.10)]. Since, 
for an accurate BVP solution, P must already be large enough that coefficients are small beyond F/2, we may choose 
q = P without losing accuracy due to quadrature in ^T\) . Note that ( |2T] ) can be evaluated for ah -Pjl < n < Pjl at 
once using a single FFT in 0(P log P) time. Since u' is analytic and the surface smooth, the Fourier data /„ decays 
super-algebraicahy with \n\, and thus we expect similar convergence with respect to P. 

We use a similar idea to evaluate matrix elements of A^, which we now describe. 


3.1. Evaluation of the axisymmetric Helmholtz ring kernel 

The m, j matrix element of A^ in the linear system ( p^ is given by the target normal-derivative of the Helmholtz 
ring kernel 


AniPm’i Zm 




n ((Pfi 

onm 
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where indicates the normal to y at the point (p^, Zm)- For the transmission BVP (Sec. |3.4| ) we will also need the 
ring kernel value matrix element 


p27T 

Anipm, Zm\p'p Zj) j Gk((pm, 0, Zm), (p'j, Zj))e~''''^dcp . (22) 

Analytic series for this kernel are presented by Conway-Cohl 1241 . and evaluation methods for them in the BIE setting, 
where the target may come very close to the source, are given in ||74l[4TJ[42l|43l. However, in the MFS setting, the 
target (p, z) is separated from the source (p', z') by a distance of at least a few times the quadrature node spacing (see 
Sec. |3.2[ ), and hence the periodic trapezoid rule quadrature 


^niPmri Zm^ Pj-> Zj) ~ 


1 ^ 

- GkdPm, 0, Zm), (Pj, Inljq, 

^ i=\ 


(23) 


will be efficient for evaluating ( [22| ). In fact, we will have exponential convergence with a rate controlled by the 
source-target separation in the p-z plane, as the following theorem shows. 


Theorem 2. The q-node periodic trapezoid rule applied to the Helmholtz ring kernel or it’s target normal derivative 
is exponentially convergent with known rate a. Namely, for any a <lm cos“^ ( ^ ^ there is a constant C 
independent of q such that 


f f " G,((p, 0,2), (p', --V G,((p, 0,2), (p', iTTlIq, z'W 

2n Jo q^ 


■Ininllq 


l=l 


< Ce- 


(24) 


for all sufficiently large q, and the same holds when Gk is replaced by n^ being the normal to y at (p, z). 

Proof The theorem of Davis 1261 states that the periodic trapezoid rule is exponentially convergent for a periodic 
analytic integrand. Specifically, for the periodic interval <p e [0, In), if an integrand f{jp) can be analytically continued 
off the real axis to a bounded 27r-periodic analytic function in the strip | Im ^| < a, then 


1 

In 


^7.n 1 ^ 

md^--Yf{lnllq) 

Jo 


< Ce-“« 


holds for some constant C and all sufficiently large q. It only remains to understand the region of analyticity of the 
integrands 

f{ip) := Gi(x, and f{ip) := ^(x, , where x = (p, 0, z), y = (p',(p, z), 

dn^ 

with respect to the variable ip. The Green’s function Q is analytic unless the distance between source and target 
vanishes, ie 

0 = |x - y| = ^J(p-p' cosp)'^ -r (p' sin^?)^ + (z - z')^ • 

Solving for p gives locations of the singularities in the complex p plane. 


p = cos 


ip-p'f + {z-z'f ^ ^ 

2pp' 


These occur with 2n periodicity in the real direction, in pairs symmetric about the real axis. Thus the two integrands 
f(p) are analytic and bounded in any strip | Im ^| < a for a given in the statement of the theorem. Applying the Davis 
theorem completes the proof. □ 
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Note that in the limit of small separation, by expanding the cosine, the maximum convergence rate guaranteed by 
Thm.|2]is 

^|(p - p'f + {z- z')^ 

a ^ — -, 

P 

that is, equal to ratio of the source-target separation in the p-z plane to p. 

In Fig. 1^ we demonstrate Thm. numerically in a typical axisymmetric MFS setting. Panel (a) shows the error 
convergence in evaluating ( [22l ), i.e. one matrix element of A„, for a small source-target separation of 0.071. The 
convergence rate is slow, independent of n, and observed to match very closely the upper bound on the rate a predicted 
by the theorem. Panel (b) shows a much larger distance, and much faster convergence, again with rate independent of 
n and as predicted by the theorem. The results for the normal derivative ring kernel ( p^ are similar. 

Notice that the value of q after which converge starts depends on the Fourier mode n, and that, as n increases this 
adds roughly \n\ to the q value at which convergence to machine precision is reached. We explain this as follows. 
In the Davis theorem the constant C asymptotic to 47rsupjjj^^^+Q, l/(^)l, i-e. a bound on the size of / in the strip 1541 
Thm. 12.6]. The factor in the integrand contributes a maximum value within the strip of which is the 
dominates C. Thus around \n\ extra quadrature nodes are needed to reach the error level for ^ = 0. 

Remark 2. Since the individual terms in the quadrature sum are of size 0(1) for a typical source-target separa¬ 
tions in a unit-sized obstacle, we can get convergence to machine precision in the absolute error, but not the relative 
error, of each matrix element ofA'^ or A„. However, absolute error is the relevant quantity controlling final accuracy 
since the solution field is summed over Fourier modes 


In the rest of this paper, we fix q to the same value for filling all matrix elements in A^ or A„, determined by a 

It would be possible to use that above theorem to choose q differently to be 


convergence test as shown in Sec. 3.3 


more optimal for each matrix element (source-target distance); however, since the matrix filling is not a large part of 
the total solution time there would not be much benefit. For evaluation of the solution potential ( p^ on other periodic 
images of the obstacle, we find that a fixed value ^ = P is adequate. 


3.2. Choice of source point locations 

The performance of the MFS depends critically on the choice of the curve F_ on which the N sources lie 13X1 [^ . 
For analytic boundaries there are strong theoretical results. In this case, Katsurada ED proved that, in exact arithmetic, 
the MFS has exponential convergence for the 2D Dirichlet Laplace (k = 0) interior BVP, with M = N and points 
chosen equally spaced in the parametrization of the curves F and F_. This was generalized to all analytic kernels 
(thus including the Helmholtz kernel) by Kangro (481, and recently to the 3D Maxwell case when the source points 
{y^} are the nodes of an exponentially convergent quadrature on F_ (491 (presumably a similar proof would apply for 
Helmholtz). 

However, in floating-point arithmetic another constraint arises, associated with the exponentially large condition 
number of the MFS system matrix. Namely, the coefficient norm ||c|| should remain small, i.e. 0(1). It is conjectured 
that this happens if and only if the exterior solution u can be continued as a regular solution to the PDF inside Q up 
to and including F_, in other words if F_ encloses all of the singularities in the continuation of u iHConj. 12]. This is 
suggested by theory on the continuous first-kind integral equation (56l p. 1238] (28l Thm. 2.4], and numerical results 
where the singularities are known (via the Schwartz function) (71 . If F_ is placed too far from the surface, it will not 
enclose these singularities, and exponential blowup of ||c|| results, causing rounding error which limits the solution 
error for u. 

Now we compare the A^-convergence of the error in our setting, for various different methods for choosing source 
locations. We use the notation that the sources y^ := (pj,Zj), j = 1,..., lying on a curve F_ are displaced from N 
surface points Xj \= (p(tj), z(tj)) with parameters tj = n(j - l/2)/A^, j = 1. . .,N, uniformly spaced on the generating 
curve y. The methods we compare are: 

(a) displacement by a constant distance r in the normal direction, 

(b) displacement by a distance Ts(tj) in the normal direction, where s(t) is the speed function, and 

(c) displacement in the “imaginary direction” by complexification of the boundary parametrization, i.e. y^ := (p(tj-\- 
iT),z(tj + ir)). This is a simplification of methods from Q. 
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Figure 5: Convergence of error at a single distant point in the solution of Neumann Helmholtz scattering problems at = 10 via the MFS for 
various source point displacement schemes, for the boundary curve the “smooth” shape of Fig. 0a), f(0) = 1 + 0.3 COS 40 for 6 e [0,;:]. The top 
row of plots shows the error for solution of a 2D BVP using a boundary curve reflected to form a closed curve 6 e [0, 2n]; the bottom row shows 
error for the corresponding 3D axisymmetric BVP. N is the number of source points in 0 G [0,;:]. (a) and (b): constant displacement from surface 
points along the normal vector, (c) and (d): scaling the displacement in proportion to the “speed” \s(t)\. (e) and (f): constant displacement in the 
imaginary parameter direction. In each plot “aliasing” error due to sources too close to the boundary dominates in the bottom left, and growth of 
error in the vertical direction is caused by round-off due to growing coefficient norms ||c||. 
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Note that all methods depend upon the parametrization, and we assume that one is used for which {x^} provides a good 
quadrature scheme on y (combined with its reflection). 

In Fig. we compare these three methods for solving the Neumann Helmholtz scattering problem in the smooth 
shape shown in Fig. |^a), using a dense direct linear In fact we compare the 2D case of plane-wave scattering from 
the curve (closed by combining with its reflection about the z axis), against the results for the 3D axisymmetric case, 
using a dense direct solve for each Fourier mode. Two conclusions stand out: 1) the 2D BVP is an excellent indicator 
of error performance in the 3D axisymmetric case, and 2) the complexiflcation method is at least as good as the other 
methods at small r, and is more forgiving at large r. Thus, in the rest of this work we choose the complexiflcation 
method. 

The tested wavenumber k = 10 is not high; we And that at ^ = 30 the blow-up at large r is more severe even for 
the complexiflcation method. Thus we choose r empirically so that it gives rapid N convergence but does not blow 
up at high k. This we assess via experimentation with the 2D BVP, since is very rapid to run (taking a fraction of a 
second per solve), and is demonstrated to reflect the 3D performance. In future work we will present an automated 
method for choosing r and N. 

Remark 3. The monopole (‘"single-layer”) MFS representation that we use fails to be a complete representation of 
exterior radiative potentials if the surface on which the sources lie has an interior Neumann eigenvalue equal to 
Sec. 2.1]. In practice, this can be detected by a poor boundary error (e\ in Sec. \3.3\ below), and r changed 
slightly. In our experiments we have never had to make such a change. 


3.3. Isolated obstacle Neumann scattering convergence tests 

Thus far we have made three approximations, each of which involves a parameter: 1) the MFS approximation 
involving N source points, 2) the Fourier series truncation to P terms, and 3) the quadrature evaluation of the ring 
kernel via q periodic trapezoid nodes. We now show convergence results for these parameters in the context of solving 
the one-obstacle scattering problem ([8]), ( p^ and ( p^ . The first row of Fig. shows the convergence with respect 
to V, the second row with respect to P, and the last row with respect to q. In each case the parameters not under 
convergence study are fixed at their converged values. Two types of error are shown: 

• 6i: Absolute error in the boundary condition estimated at a grid ofl28xl28 points on dkl very few of 
which coincide with collocation points X/, and 

• 62 ’. relative error in the value of the scattered potential w at a distant point near to (10,10,10), compared to its 
converged value. 

Fig. is consistent with exponential convergence in all three parameters, with the values where convergence starts 
being larger for large wavenumber k. The convergence of the two error measures is similar, usually differing by a 
fixed constant, and saturating at different values. Even in the case of a resonant object at high frequency (the plots 
in the right column), 9 digits are reached in the boundary condition error, and 12 digits in the solution at a distant 
point. 

3.4. The transmission case 

Now we briefly describe how the above axisymmetric MFS scheme is adjusted for the transmission case ( |lla| )- 
( |llc| ). As before, the scattered wave u is represented by the ring kernel MFS source sum with source locations 
(p'j, z'j) inside the obstacle. In addition the scattered wave u~ inside the obstacle is represented by a similar sum 

N PI 2 ^ 

m" ~ y y Cnj^nj’ •= ^ I y-((p):=(p”,(p,z”), 

j=l n=-P/2+l -'O 

with sources lying outside the obstacle. The set-up is as in Fig.j^c) (which shows a 2D sketch), an example 

being the green points in Fig. |^b). These exterior source locations are chosen by negating r in the source point 
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Figure 6: Error convergence for the 3D axisymmetric MFS solution of the Helmholtz Neumann scattering BVP for low and high frequencies 
(wavenumber k is shown in legends). ei estimates the boundary condition error, 62 the solution error at a distant point. Left column: “smooth” 
shape from Fig. HJa), with T = 0.1. (a) varying N, (b) varying P, (c) varying q. The fixed (converges) parameter values are N = 260, P = 150, 
q = 400. Middle column: “wiggly” shape from Fig.[^b), with r = 0.03. Fixed values are = 1000, P = 180, ^ = 1200. Right column: cup shape 
from Fig.[^c), with r = 0.1. Fixed values are N = 500, P = 160, q = 800. In all cases M ^ \.2N. 
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location algorithms in Sec. |3.2| Imposing the value and normal derivative matching conditions ( |llb| ) and ( |llc| ) gives 
the linear system 

A _4-ir ^ ^ r f 

(25) 

where each of the four blocks is block diagonal, with rectangular diagonal blocks An,A~,A'^, A'~ respectively, -P/l < 
n < P/2. The new blocks A~ and A'~ are identical to ( |22| ) and respectively, but with k replaced by k-,p'j by pj, and 

z'j by z'y The RHS data blocks fn and f/ are constructed as before via ^2A\ using / = -u' and f' = -^ respectively. 
Abusing notation somewhat, we will summarize both ( [2Q| and ( [^ by the linear system 


^0^] = f. 

where Aq is of size MP-hy-NP in the Neumann case, and size 2MP-hy-2NP in the transmission case. 


(26) 


4. Periodizing scheme 

We choose a cuboid unit cell Qbox of sizes Cx and Cy in the x and y axes respectively, and truncated in the vertical 
direction to z e [-Zo, Zo] such as to contain Q. The rectangular walls enclosing Q^ox are left L, right R = L (Cx, 0,0), 
back B, front F = B + (0,ey, 0), and downwards D at z = -Zo and top T at z = zo; see Fig. [^b). All normals point 
in the positive coordinate directions. We use the abbreviation ul to mean u restricted to L, and UnL to mean du/dn 
restricted to L. We now reformulate the periodic BVP on Q^ox alone, as in ifTSlI^ . Taking for simplicity the Neumann 
case, u satisfies ^ in ([T^, and we match Cauchy data on the four side walls giving 



Ur - aUL = 0 

(27) 


UnR O^UyiR — 0 

(28) 


Up — I3ur = 0 

(29) 


UnF — 0 ? 

(30) 

which (because of the unique continuation property of an elliptic PDF) is equivalent to quasiperiodicity ([^, and finally 
match Cauchy data to Rayleigh-Bloch expansions ([9a|-([9b|) on the top and bottom walls. 

u(x,y,zo) = 

E exp i[i^x + /^y] , (x, y, zo) e T 

m,neZ 

(31) 

Uz(x,y,zo) = 

^ exp /[<x + K^y], (x, y, zo) e T 

m,neZ 

(32) 

u(x,y,-zo) = 

E exp i[K^x + K^y] , (x, y, -zo) e D 

m,neZ 

(33) 

Uzix,y,-zo) = 

- E exp i[K^x + K^y], (x, y, -zo) e D . 

m,neZ 

(34) 

The solution u in is represented as 



N 

PH P 1 



u(x)^Yj Z 

j=l n=-P/2+1 


1=0 m=-l 

where the sum of the ring kernel over the nearest neighbors in the lattice is 


E rE,(x,y/^) 


+ mCx + ney)e ”"^d(p . 


(35) 


(36) 


The unknowns are rj := {Cnj]j=i,...,N-Pi 2 <n<Pi 2 , d := {dim]i=o .p,|m|<;> and we restrict the Rayleigh-Bloch expansions 

to the xy-plane wavevectors of magnitude at most ttAq, where No is a convergence parameter, so a := 
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Remark 4 (Choice of zo)* If the vertical extent ofQ. is not larger than the period, then choosing D.i,ox roughly cubical 
is efficient. Generally, zo rnust be at least some distance (say, 1/4 period) above the extent of D. in order that the 
Rayleigh-Bloch expansions converge rapidly; on the other hand ifzo is too large, the spherical harmonic convergence 
rate on the faces is reduced. For high aspect ratio obstacles (not tested in this work), either more local terms are 
needed in ( [3^ , or an elongated proxy surface could be used (in the style ofApp. A; also see B)- 


4.1. Full linear system 

We now build the full linear system that the stacked column vector of all unknowns [ 77 ; d; a; b] must satisfy, by 
enforcing the boundary conditions, but also the quasiperiodicity and upward and downward radiation conditions. 

The first block row arises from enforcing on the boundary nodes on dQ. as in Sec. 0 giving 

Ar] + m = f (37) 

where 

A=Ao+Aeise, (38) 

Ao being the direct self-interaction of the obstacle as in ( [^ (i.e. Aq is zero apart from diagonal blocks A“). Rather 
than writing a long formula for the matrix elements of A^ke, it is more useful to describe its action: A^^^^p is the set of 
Fourier series coefficients of the normal-derivatives of u on the target rings due to the eight phased ring 

sources from the first term in ( [35] ), omitting the central copy m = ^ = 0. In practice we may compute A^^,^p efficiently 
as follows: the quadrature ( [^ is equivalent to replacing each ring kernel by q point sources with strengths given by 
the FFT of the coefficients Cnj in the n direction. The FMM evaluates the potential on the ^qN sources at a set of 
qM trapezoidal-node targets on rings on (9Q, and the FFT is finally used to convert back to Fourier coefficients as in 
A quadrature parameter q — P sufficient here, since interactions in Agj^e are distant. The cost of applying Agi^g 
(assuming M = 0(N)) is thus 0(NP log P), although it is typically dominated by the 0(NP) of the FMM. 

The matrix B in ( [TT] ) has elements that can approximated by the periodic trapezoid rule, 

B„nm = f f ji(kri)Yi„(0i, ~ “ X jiikrdYi^iOi, 2nslq)e~^™^^‘’, 

2n Jo 


where r, = -r z^ and 6i = tan ^ Zi!pi are the spherical coordinates of the ith boundary point on y. Note that if 
the axis of the obstacle is aligned with the z-axis of the spherical harmonic expansion, simplifications apply making 
B sparse; for more generality we leave it in the above form. We fill the matrix B once and for all at a given k, at a cost 
0(NPp^). 

The remaining block rows of the full linear system result from substituting ( [35] ) into ([77])-([34]), or, more specifi¬ 
cally, evaluated at collocation nodes lying on the faces. For these nodes we use the nodes from a Mi-by-Mi 

product Gaussian quadrature on each rectangular face, with M\ chosen large enough that further changes have no 
effect on the solution. We pick M\ to be Mi 4| so that we can guarantee 4 points per wavelength in each direction 
of the rectangular faces. 

The resulting full system has the form 


A 

B 

0 

0 

Cl,r 

Sl,r 

0 

0 

GnL,nR 

^ nL,nR 

0 

0 

Cb,f 

Sb,f 

0 

0 

CnB,nF 

S nB,nF 

0 

0 

Ct 

S T 

-Wt 

0 

CnT 

SnT 

-WnT 

0 

Cd 

Sd 

0 

-Wz) 

_ CnD 

SnD 

0 

-WnD. 


7 

0 

0 

0 

0 

0 

0 

0 

0 


summarized as 


A B 

q 


f 

c Q 



0 


(39) 


where ^ [d; a; b] groups the periodizing unknowns, and where the division of the matrix blocks defining the 2x2 

block notation is shown by lines. 
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The recipe for filling the above blocks is implicit in the above description, and rather than give their full formulae 
(see ED for full formulae in a related 2D problem), we explain their meaning. The A block is already given in ( [38] ). 
The B block is merely B padded with zeros to its right. The C block describes the effect of the ring kernel coefficients 
Cnj in Tj on the discrepancies, namely the left-hand sides of ([27])-([30|), and on the Cauchy data on T and D. Significant 
cancellation occurs in the upper four C blocks, identical to that in IH. Consider the 3 x 3 grid of nearest neighbors 
in phased image sums such as ( [3^ , in the block Cl,/?: the effect of the left-most six on the L wall cancels the effect 
of the right-most six on the R wall, leaving only 6 of the original 18 terms. The remaining terms correspond only to 
distant interactions: the effect of the right-most three on L and the left-most three on R. Identical cancellations occur 
in the next three blocks of C. As discussed in m, this also allows the scheme to work well even if Q is not confined 
within Qbox and wall-obstacle intersections occur. The C matrix is filled densely once and for all at each k, at a cost 
OiM^NP). 

Q has 8Mj rows and Nq := (p l)^ 0 (Nq) columns; note that this is independent of TV, the number of obstacle 

unknowns. However, both dimensions of Q grow with wavenumber as 0(k^). We fill its non-zero blocks densely by 
evaluation of spherical harmonics (for the S blocks) and plane-wave expansions (for the W blocks). The S blocks 
give the effect of the spherical harmonic basis on the discrepancies and Cauchy data on T and D, while the W blocks 
give the effect of the Rayleigh-Bloch expansions on and Cauchy data on T and D. The negative signs in the latter 
account for the fact that the jumps in Cauchy data should vanish. 

The transmission case is similar to the above, with A twice the size in each dimension (with Aq as given in ([25])), 
and [/;/'] replacing / in the RHS. Note that the representation of the interior potential u~ only involves Aq, i.e. a 
non-periodized ring kernel. 

In the next section we will demonstrate convergence with respect to the parameters p and TVq. 

Remark 5. Proxy source points have recently been proposed in a similar periodizing scheme for the 3D Laplace 
equation However, in App. A we show that for Helmholtz problems spherical harmonics are a much more 

efficient choice than proxy points. 

4.2. Rapid solution of the linear system 

Since we expect TV = NP, the number of columns of A, to be 10"^ or greater, the linear system ( [39] ) is too large to 
solve directly (in contrast to related 2D work El ID). Hence we wish to apply an iterative method for the obstacle ring 
source unknowns q. We eliminate the smaller number of unknowns in taking a Schur complement of ( [39] ), to give 

(A - BQ^Oq = f 

where is the Moore-Penrose pseudoinverse of Q. This is a rectangular system with poor conditioning similar to 
that of the axisymmetric MFS system A^q = / in ( [26] ). Note that A - BQ^C computes the one-obstacle interaction Aq 
but with the quasiperiodic Green’s function (I; this explains why the scheme breaks down at Wood anomalies. Using 
( [38] ) and right-preconditioning by A^, we would get the square system 

(AoA^ + Ae,eA^ - BQ^CA^)q = f , 

from which we can recover the solution q = A^q. However, since M > N and Aq is often rank-deficient in MFS 
applications, Aq has less than full range. AqAq is the orthogonal projector onto the range of Aq. To create a well- 
conditioned system which can be solved iteratively, we replace AqAq by the identity, since this has no effect on the 
resulting desired q, solving 

(/ + Ae,eA^-5e^CA^)77 = / (40) 

via a non-symmetric Krylov method such as GMRES. In effect we are working in a space of surface unknowns q 
rather than ring charge source unknowns q. Once q is known, ^ = [d; a; b] is reconstructed via 

^ = -Q^Cn , 

and the solution potential u can then be evaluated anywhere in Qbox using q and d. The solution in |z| > zo can be 
evaluated via ( [9a] ) using a or using b. 
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Figure 7: Convergence of errors for the periodizing scheme for the Neumann scattering from a grating of “smooth” objects as in Fig.|^a), at low 
frequency. Three types of error are shown: L 2 norm for the boundary condition, L 2 norm for the periodicity matching flux conservation, and (a) 
error vs p, flxing Nq = 15; (b) error vs Nq when p = 24. 


BVP/shape 

k 

N 

P 

P 

No 

Ml 

fill 

factor 

solve 

# iters 

4c 


4ux 

RAM 

Neum/smooth 

4 

150 

60 

24 

13 

24 

22 s 

4.8 s 

34 s 

12 

9e-ll 

2e-10 

8e-ll 

2GB 

Trans/smooth 

4,6 

150 

60 

24 

13 

24 

26 s 

6.6 s 

36 s 

12 

le-11 

le-10 

2e-ll 

2.5GB 


Table 1: Low-frequency periodic scattering example, for the “smooth” shape of Fig. 03) in a unit cell 1.3/1 in period. The columns show wavenum¬ 
bers (k, and k- when appropriate), numerical parameters, timings, three error metrics, and total RAM usage. The two rows are for Neumann 
boundary condition, and transmission condition. N is the number of MFS source points, P is the number of Fourier modes, p the maximum degree 
of the auxiliary spherical harmonic basis, Nq the maximum order of the Rayleigh-Bloch expansion. The column “All” reports the time to All the 
matrices, i.e. Ao,B,C and Q; “factor” reports the factorization time, i.e. doing the SVD on the matrix blocks of Aq and on Q; while “solve” reports 
the iterative solver time. 


A practical word is needed about handling the pseudoinverses. Because of axisymmetry Aq is block diagonal, so 
each block A„ can be inverted independently by taking its SVD to give A„ = Care must be taken to apply 

each A'^ correctly, otherwise a large loss of accuracy results: to compute A'^x for some x e C^, one uses 
The truncation parameter used in is taken as 10“^^. Since there are P blocks, the precomputation required for 
Aq is O(PN^), then each application takes O(PN^) with a very small constant. Note that filling Aj^ then using it for 
matrix-vector multiplication would be dangerous since it is not backward stable; A;^ may have exponentially large 
elements which induce catastrophic round-off error (see comments in (STJ Sec. 5] and 121 Sec. 3.2]). 

Similar care is needed for Q\ a dense SVD gives Q = UllV*. Assuming that y = CA^rj has already been computed 
as above, Q^y computed as yE^(f/*y). The SVD of Q takes O(M^Nq) time. Fixing the obstacle, the numerical 
parameters V, P, p, Nq, and Mi all grow as 0(k). Thus the time for the SVD of Q grows rapidly with wavenumber as 
0(k^). This limits the largest k in practice on a standard workstation to one in which the unit cell is around a dozen 
wavelengths in period. 

Remark 6. There are other fast solution methods for the linear system ( [39] ). For instance, one could instead eliminate 
77 if a fast application ofA~^ were available (this is done in the analogous 2D periodic scattering problem in /|3?I/). 


BVP/shape 

k 

N 

P 

P 

No 

Ml 

fill 

factor 

solve 

#iters 

4c 

4®r 

4ux 

RAM 

Neum/cup 

30 

240 

150 

70 

21 

38 

167 s 

297 s 

346 s 

57 

2e-10 

5e-ll 

4e-ll 

17GB 

Neum/cup 

40 

360 

180 

86 

25 

45 

420 s 

871 s 

736 s 

65 

2e-10 

2e-10 

4e-ll 

41GB 

Trans/wiggly 

30,40 

360 

200 

70 

21 

38 

440 s 

322 s 

756 s 

62 

7e-10 

4e-ll 

9e-ll 

58GB 

Trans/wiggly 

40,60 

400 

200 

86 

25 

45 

700 s 

934 s 

1090 s 

65 

le-10 

2e-10 

le-11 

93GB 


Table 2: Higher-frequency periodic scattering examples (lOT and ISA in period). The first two rows use Neumann boundary conditions on the 
resonant cup shape, and r = 0.03; the last two rows are for transmission conditions on the “wiggly” shape of Fig.j^b) and r = 0.1. Other notation 
is as in TablelU 
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Figure 8: (a) High-frequency periodic transmission scattering solution with k = 40, k- = 60, for the “wiggly” shape of Fig.j^b). (b) High-frequency 
periodic Neumann scattering solution with k = 40, for the cup shape. In both cases the full wave is shown on two slices, the incident wave 
shown on the obstacle surfaces, and the periodicity is 134. The direction of the incident wave is0=-|,0 = |,in spherical coordinates. 


5. Results 

We first present some details of our implementation. We use MATLAB R2013b on a desktop workstation with 
two quad-core E5-2643 CPUs and 128 GB of RAM. The tolerance for GMRES (MATLAB’s implementation) is set 
to 10“^^. We apply via the EMM, using MEX interfaces to the Eortran implementation by Gimbutas-Greengard 
ES, with the following modification: we set the internal parameter maxlevel=3 in the routine dStstrcr, which has 
the effect of limiting the depth of building the oct-tree to 2. Since all sources are well-separated from all targets in 
Aeise, this bypasses time spent propagating the source multipoles up the tree via M2M, and local expansions L2L down 
the tree to the targets. The result is a factor 1-3 speed-up over the vanilla EMM call from this library. We set iprec=3 
which requests 9 digits of accuracy in the EMM. Other spherical harmonic evaluations required for B, C and Q are 
done using a MEX interface to the recurrence relation based fortran libraries in EH, available at 
http://math.dartmouth.edu/~ahb/software/localexp3d.tgz 

The set of MATLAB codes we developed for the tests in this paper will be available at 
https://math.dartmouth.edu/~yliu/software/acper.tgz 

Remark 7. For the N we test in this work, the dense matrix blocks B and C fit in RAM, and thus we fill them once 
then apply then via standard BLAS2 matrix-vector multiplies. If they cannot fit in RAM, then they can be applied on 
the fly using the FMM with only a constant factor change in effort. The (smaller) Q matrix must be stored and inverted 
densely in our scheme. 

Errors for the periodic scattering problem are measured in three different ways: 

• 6bc- an estimate in the error in satisfying the boundary condition on dkl (defined as 6i was in Sec.|^, 

• 6per: an estimate of the error in satisfying the periodic boundary conditions on the unit cell walls L, R, F and 
B, and 
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6flux- the flux error giving the absolute value of the difference between incoming and outgoing fluxes, 




^flux 


+ \b^n + ) - 4 °’“^ 


Note that the incident wave corresponds to Bragg orders m = n = 0, and the phase shift is needed because the 
reference for b^n is at z = -zo. For the Neumann or transmission BVPs with real-valued k and k- (non-absorbing 
materials), this error should be zero. 

There are several numerical parameters, but they fall into two categories. The one-body solution method is con¬ 
trolled by N, M, P, T, and q (convergence with respect to these being shown in Sec .1^ whereas the periodizing scheme 
is controlled by p, No, and Mi (the latter being fixed at Akin as explained in Sec. |4~l| ). We thus first test convergence 
with respect to the periodizing parameters p and Nq. Fig. shows error convergence consistent with exponential, in 
the scattering from a grating of “smooth” obstacles from Fig. [^a) at low frequency. Note that is small throughout 
the parameter range, and thus cannot alone be used to verify that the full periodic solution has converged. Choosing 
the converged parameters p = 2A and No = 13, the timing and error results are then given in Tablefor the two types 
of BVP. There are N = NP = 9000 obstacle unknowns, at the borderline where the FMM starts to become useful. 
Here the size of Q is 3200 by 1387, making it rapid to invert (via SVD). Thus at low frequency, the computation is 
dominated by the FMM application of A^i^e needed in each GMRES iteration. 

We show error and timing results at higher frequencies in TableThe wavenumber k = 40, i.e. a unit cell period 
of 13/1 X 13/1, is around the largest that is practical on a single workstation, needing many minutes to take the SVD 
of the Q matrix of size 16200 by 11493. The SVD (factorization) and GMRES (solve) stages now take comparable 
times. The solution (total wave u' = u u') for the two highest frequency cases are shown on planes in Fig. For 
evaluation of u we used the FMM for the first term in ( [^ and direct spherical harmonic summation for the second. 
Note that the number of GMRES iterations is similar (around 60) for the transmission case as for the highly-resonant 
Neumann cup shape. The latter models a grating of acoustic Helmholtz resonators. This illustrates that, because the 
one-obstacle problem is factorized, the iterative part of the scheme is immune to obstacle resonances, in contrast to 
the case where GMRES is used to solve the entire scattering problem. 


6. Conclusions 

We have presented an acoustic solver for doubly-periodic gratings of smooth axisymmetric obstacles, that is spec¬ 
trally accurate with respect to all of its convergence parameters, enabling high (10-digit) accuracies to be reached 
efficiently even at medium to high frequencies. It combines an existing axisymmetric one-body solution using the 
method of fundamental solutions (MFS) with a new periodizing scheme based on auxiliary spherical harmonics, up¬ 
ward and downward radiation expansions, and collocation on the walls of one unit cell. The result avoids any singular 
surface quadratures, is efficient for grating periods up to a dozen wavelengths, handles highly resonant obstacles with¬ 
out extra cost, and is relatively simple to code. The one-body factorization cost is O(PN^) with a small prefactor; in 
the high-frequency limit this would scale as 0(k^). However, the solve cost per new incident wave is then dominated 
by only the 0(NP) linear cost of an FMM call, i.e. 0(k^). A key feature is that the periodization scheme is independent 
of the one-body solver, so that the latter could be replaced by an existing boundary integral based solver OllIslEl 
without modification, allowing the handling of more general shapes, corners, and edges. 

Although our main contribution is the new periodizing scheme, we also have contributed improvements to the 
one-body MFS scheme. Firstly, we use boundary complexification to place the MFS sources; the choice of distance 
parameter r may be made cheaply using simulations via the analogous 2D BVR Secondly we give a rigorous analysis 
of the use of the periodic trapezoid rule for evaluating Helmholtz ring kernels. Unlike in boundary integral equation 
methods (which require arbitrarily close source-target evaluations), all of our kernels can be evaluated in this way 
because of the source separation in the MFS. Since applying (A - BQ^C) is equivalent to applying the quasiperiodic 
Green’s function, our method as presented cannot work at Wood anomalies. However, because the Wood singularity 
is of inverse square-root form, the neighborhood around which a Wood anomaly causes a loss of accuracy is small, 
and this accuracy loss is consistent with backwards stability given the accuracy with which the incident wavevector 
is specified. We note that, were the solver to be extended to handle connected interfaces, robust handling Wood 
anomalies would then come for free (as in ED). 
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Figure 9: Convergence comparison of spherical harmonics (“sh”) vs proxy points (“pxy”) as the auxiliary basis for the periodizing scheme, for the 
Neumann scattering problem from a lattice of “smooth” obstacles as in Fig.|^a), at wavenumber k = 20. Note that the two schemes have nearly 
identical numbers of unknowns when p = N 2 . We have fixed the Rayleigh-Bloch degree as iVo = 12. 


Future extensions that would increase the range of application of the solver include: automating the choice of all 
convergence parameters given a shape and wavenumber k; including MFS source point choices that handle axisym- 
metric edges |[60]| ; generalizing to multilayer media, as in |[2T]| ; replacing the eight-neighbor FMM call by a low-rank 
compression scheme for applying the Aei^e interactions, which currently dominate the solution time IbOl : extension to 
the Maxwell equations EQI; and replacing the MFS scheme by a boundary integral solver, such as the recent 
fast direct solver to create a 3D version of lEl. 
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Appendix A. Comparing spherical harmonics vs proxy points as a periodizing basis 

Our periodizing scheme represents the contribution of the lattice of distant copies of the obstacle to the potential 
with a basis expansion for regular Helmholtz solutions in the unit cell. Here we compare two choices of this basis: the 
spherical harmonics up to degree p (as used in the rest of this work), vs proxy points placed along lines of 
longitude on a distant sphere of radius R. The latter has been proposed in the context of periodizing the 3D Laplace 
equation by Gumerov-Duraiswami (391. The radius R used is R = 3.5, chosen to optimize proxy point efficiency. 
The errors we test are and defined in Sec.[^ for the Neumann scattering from a grating of “smooth” obstacles 
from Fig. [^a) at wavenumber k = 20. The period is around lA. We show in Fig. the convergence results, and 
see that N 2 ~ 2p is needed to achieve similar errors in the two representations, i.e. the proxy scheme requires four 
times the number of unknowns required by the spherical harmonic scheme. (Note that efficiency differences due to 
the different proxy point arrangements discussed in (391 are small compared to this factor.) We also checked that the 
solution potential u at the point (0.9,0.9,0.9) agreed between the two methods at their converged parameters with a 
relative error of 1.2 x 10“^^. This is consistent with the errors shown in the figure, and provides some verification of 
the correctness of each scheme. In terms of speed, for comparable 12-digit errors, the SVD of the Q matrix for p = 58 
spherical harmonics takes 217 s, whereas using N 2 = 100 proxy points takes 1097 s, around five times slower. Thus 
we claim that spherical harmonics are by far the better choice for periodizing Helmholtz problems. 
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